Relative Reward Strength Algorithms for Learning

Authors

  • Rahul Simha
  • James F. Kurose
Abstract

We examine a new class of action probability update algorithms for learning automata that use the relative reward strengths of responses from the environment. Specifically, we study update algorithms for S-model automata in which "recent" environmental responses for each of the actions are retained and used. We prove a convergence result and study the behavior of these automata through simulation. A major result of the paper is that the performance of these algorithms is superior, in several respects, to that of the well-known SLR-I (S-model linear reward-inaction) update algorithm. Additional results are presented on the variability of performance, the cost of learning and, in the case of static environments, modifications that result in improved convergence.
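The idea described in the abstract can be illustrated with a minimal sketch: an S-model automaton (continuous reward strengths in [0, 1]) that retains the most recent response for each action and nudges its action probabilities toward the relative reward strengths. The class name, the step-size parameter, and the exact update rule below are illustrative assumptions, not the paper's definitions.

```python
import random

class RelativeRewardAutomaton:
    """Illustrative sketch of a relative-reward-strength automaton.

    Keeps the most recent environmental response (reward strength) for
    each action and moves the action probabilities toward the normalized
    relative strengths. The update rule here is an assumption for
    illustration, not the algorithm proved convergent in the paper.
    """

    def __init__(self, n_actions, step=0.05):
        self.p = [1.0 / n_actions] * n_actions   # action probabilities
        self.recent = [0.0] * n_actions          # last reward per action
        self.step = step

    def choose(self):
        # Sample an action according to the current probability vector.
        r, acc = random.random(), 0.0
        for i, pi in enumerate(self.p):
            acc += pi
            if r <= acc:
                return i
        return len(self.p) - 1

    def update(self, action, reward):
        # S-model: reward is a strength in [0, 1], not a binary signal.
        self.recent[action] = reward
        total = sum(self.recent)
        if total == 0:
            return
        # Relative reward strengths act as the target distribution.
        target = [s / total for s in self.recent]
        # Convex step toward the target keeps p a valid distribution.
        self.p = [pi + self.step * (ti - pi)
                  for pi, ti in zip(self.p, target)]
```

Because each update is a convex combination of the current probabilities and the normalized strengths, the probability vector always sums to one; in a static environment the probability mass drifts toward the action with the larger retained reward.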


Related articles

Reinforcement Learning by Comparing Immediate Reward

This paper introduces a reinforcement learning approach based on comparing immediate rewards, using a variation of the Q-learning algorithm. Unlike conventional Q-learning, the proposed algorithm compares the current reward with the immediate reward of the past move and acts accordingly. Relative-reward-based Q-learning is an approach towards interactive learning. Q-learning is a model-free re...
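A hedged sketch of the comparison the snippet describes: a Q-learning step that updates with the difference between the current reward and the reward of the previous move, rather than the raw reward. The function name, signature, and exact comparison rule are assumptions for illustration; the paper's precise rule is not reproduced here.

```python
from collections import defaultdict

def relative_reward_q_update(Q, state, action, reward, prev_reward,
                             next_state, actions, alpha=0.1, gamma=0.9):
    """One illustrative 'relative reward' Q-learning step.

    Uses (reward - prev_reward) in place of the raw reward in the
    standard Q-learning temporal-difference target. This is a sketch
    of the idea, not the paper's exact update.
    """
    relative = reward - prev_reward
    best_next = max(Q[(next_state, a)] for a in actions)
    td_error = relative + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error
    return Q[(state, action)]
```

With an empty table, a step from reward 1.0 after a previous reward of 0.5 moves the entry by alpha times the relative reward, i.e. 0.1 × 0.5.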


Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results (Editor: Leslie Kaelbling)

This paper presents a detailed study of average reward reinforcement learning, an undiscounted optimality framework that is more appropriate for cyclical tasks than the much better studied discounted framework. A wide spectrum of average reward algorithms are described, ranging from synchronous dynamic programming methods to several (provably convergent) asynchronous algorithms from optimal co...



Spatial attention (Jiang, Sha, Remington)

This study documented the relative strength of task goals, visual statistical learning, and monetary reward in guiding spatial attention. Using a difficult T-among-L search task, we cued spatial attention to one visual quadrant by (i) instructing people to prioritize it (goal-driven attention), (ii) placing the target frequently there (location probability learning), or (iii) associating that q...


Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning

Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rare or expensive, converging to a solution can be impractically slow. One way to exploit additional domain knowledge is to use more readily available, but related quantities as secondary reinforcers to guide the search t...




Publication date: 1989